NeMa: Fast Graph Search with Label Similarity

Authors

  • Arijit Khan
  • Yinghui Wu
  • Charu C. Aggarwal
  • Xifeng Yan
Abstract

It is increasingly common to find real-life data represented as networks of labeled, heterogeneous entities. To query these networks, one often needs to identify the matches of a given query graph in a (typically large) network modeled as a target graph. Due to noise and the lack of fixed schema in the target graph, the query graph can substantially differ from its matches in the target graph in both structure and node labels, thus bringing challenges to the graph querying tasks. In this paper, we propose NeMa (Network Match), a neighborhood-based subgraph matching technique for querying real-life networks. (1) To measure the quality of the match, we propose a novel subgraph matching cost metric that aggregates the costs of matching individual nodes, and unifies both structure and node label similarities. (2) Based on the metric, we formulate the minimum cost subgraph matching problem. Given a query graph and a target graph, the problem is to identify the (top-k) matches of the query graph with minimum costs in the target graph. We show that the problem is NP-hard, and also hard to approximate. (3) We propose a heuristic algorithm for solving the problem based on an inference model. In addition, we propose optimization techniques to improve the efficiency of our method. (4) We empirically verify that NeMa is both effective and efficient compared to the keyword search and various state-of-the-art graph querying techniques.
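
The cost metric in item (1) and the top-k problem in item (2) can be illustrated with a small sketch. This is not the paper's definition: the token-based Jaccard label similarity, the 0.5-per-hop proximity weights, the equal weighting of label and neighborhood costs, and all function names are assumptions chosen for illustration. The search is a brute-force enumeration that only works on tiny graphs; it stands in for, and does not reproduce, the inference-based heuristic of item (3).

```python
from collections import deque
from itertools import permutations


def label_similarity(a, b):
    """Jaccard similarity over lowercase word tokens (assumed stand-in measure)."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    if not ta and not tb:
        return 1.0
    return len(ta & tb) / len(ta | tb)


def neighborhood(graph, node, hops=2):
    """Map every node within `hops` of `node` to an assumed weight 0.5 ** distance."""
    weights, seen = {}, {node}
    frontier = deque([(node, 0)])
    while frontier:
        cur, dist = frontier.popleft()
        if dist == hops:
            continue
        for nxt in graph[cur]:
            if nxt not in seen:
                seen.add(nxt)
                weights[nxt] = 0.5 ** (dist + 1)
                frontier.append((nxt, dist + 1))
    return weights


def node_cost(query, q_labels, target, t_labels, u, v, hops=2):
    """Cost of matching query node u to target node v: label cost plus neighborhood cost."""
    label_cost = 1.0 - label_similarity(q_labels[u], t_labels[v])
    nq = neighborhood(query, u, hops)
    nt = neighborhood(target, v, hops)
    total = sum(nq.values())
    missing = 0.0
    for u2, wq in nq.items():
        # Best support for query neighbor u2 among the neighbors of v.
        best = max(
            (label_similarity(q_labels[u2], t_labels[v2]) * min(wq, wt)
             for v2, wt in nt.items()),
            default=0.0,
        )
        missing += wq - best
    struct_cost = missing / total if total else 0.0
    return 0.5 * label_cost + 0.5 * struct_cost  # equal weighting is an assumption


def match_cost(query, q_labels, target, t_labels, mapping, hops=2):
    """Aggregate the per-node costs of one query-to-target mapping."""
    return sum(node_cost(query, q_labels, target, t_labels, u, v, hops)
               for u, v in mapping.items())


def top_k_matches(query, q_labels, target, t_labels, k=1, hops=2):
    """Brute-force top-k over all injective mappings (tiny graphs only)."""
    q_nodes = sorted(query)
    scored = []
    for image in permutations(sorted(target), len(q_nodes)):
        mapping = dict(zip(q_nodes, image))
        scored.append((match_cost(query, q_labels, target, t_labels, mapping, hops), mapping))
    scored.sort(key=lambda pair: pair[0])
    return scored[:k]


# Hypothetical toy data: adjacency sets plus node labels.
query = {"q1": {"q2"}, "q2": {"q1"}}
q_labels = {"q1": "actor", "q2": "film"}
target = {"t1": {"t2"}, "t2": {"t1", "t3"}, "t3": {"t2"}}
t_labels = {"t1": "actor", "t2": "film", "t3": "director"}

for cost, mapping in top_k_matches(query, q_labels, target, t_labels, k=2):
    print(cost, mapping)
```

Comparing neighborhoods by proximity weight rather than by exact edges means a target node can still match well when an expected neighbor sits two hops away instead of one, which is the kind of structural noise the abstract describes.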

Similar resources

MSQ-Index: A Succinct Index for Fast Graph Similarity Search

Graph similarity search has received considerable attention in many applications, such as bioinformatics, data mining, pattern recognition, and social networks. Existing methods for this problem have limited scalability because of the huge amount of memory they consume when handling very large graph databases with millions or billions of graphs. In this paper, we study the problem of graph simi...

LaSaS: an Aggregated Search based Graph Matching Approach

Graph querying is crucial to fully exploit the knowledge within the widely used graph datasets. However, graph datasets are usually noisy which makes the approximate graph matching tools favored to overcome restrictive query answering. In this paper, we introduce a new framework of approximate graph matching based on aggregated search called Label and Structure Similarity Aggregated Search (LaS...

Chemical Similarity Searching with neural graph matching methods

The thesis examines three novel structural similarity methods that employ a network of simple auto-associative neural networks for storing structural information about databases of molecular graphs. This information can be used to discover similarities from a query graph to any of the graphs in the model database. The fast learning and recall ability of the neural network facilitates efficient ...

Deep Multi-label Hashing for Large-Scale Visual Search Based on Semantic Graph

Huge volumes of images are aggregated over time because many people upload their favorite images to various social websites such as Flickr and share them with their friends. Accordingly, visual search from large scale image databases is getting more and more important. Hashing is an efficient technique to large-scale visual content search, and learning-based hashing approaches have achieved gre...

Neighbor-Aware Search for Approximate Labeled Graph Matching using the Chi-Square Statistics

Labeled graphs provide a natural way of representing entities, relationships and structures within real datasets such as knowledge graphs and protein interactions. Applications such as question answering, semantic search, and motif discovery entail efficient approaches for subgraph matching involving both label and structural similarities. Given the NP-completeness of subgraph isomorphism and t...

Journal:
  • PVLDB

Volume 6

Publication date: 2013